Evaluation Measures for Hierarchical Classification: a unified view and novel approaches
Hierarchical classification addresses the problem of classifying items into a
hierarchy of classes. An important issue in hierarchical classification is the
evaluation of different classification algorithms, which is complicated by the
hierarchical relations among the classes. Several evaluation measures have been
proposed for hierarchical classification using the hierarchy in different ways.
This paper studies the problem of evaluation in hierarchical classification by
analyzing and abstracting the key components of the existing performance
measures. It also proposes two alternative generic views of hierarchical
evaluation and introduces two corresponding novel measures. The proposed
measures, along with the state-of-the-art ones, are empirically tested on three
large datasets from the domain of text classification. The empirical results
illustrate the undesirable behavior of existing approaches and how the proposed
methods overcome most of these problems across a range of cases.
Comment: Submitted to journal
On Maximum Margin Hierarchical Classification
We present work in progress towards maximum margin hierarchical classification, where objects are allowed to belong to more than one category at a time. The classification hierarchy is represented as a Markov network equipped with an exponential family defined on the edges. We present a variation of the maximum margin multilabel learning framework that is suited to the hierarchical classification task and allows efficient implementation via gradient-based methods. We compare the behaviour of the proposed method to the recently introduced hierarchical regularized least squares classifier, as well as to two SVM variants, in Reuters news article classification.
On Horizontal and Vertical Separation in Hierarchical Text Classification
Hierarchy is a common and effective way of organizing data and representing
their relationships at different levels of abstraction. However, hierarchical
data dependencies cause difficulties in the estimation of "separable" models
that can distinguish between the entities in the hierarchy. Extracting
separable models of hierarchical entities requires us to take their relative
position into account and to consider the different types of dependencies in
the hierarchy. In this paper, we present an investigation of the effect of
separability in text-based entity classification and argue that in hierarchical
classification, a separation property should be established between entities
not only in the same layer, but also in different layers. Our main findings are
as follows. First, we analyse the importance of separability on the data
representation in the task of classification and based on that, we introduce a
"Strong Separation Principle" for optimizing the expected effectiveness of
classifiers' decisions based on the separation property. Second, we present
Hierarchical Significant Words Language Models (HSWLM) which capture all, and
only, the essential features of hierarchical entities according to their
relative position in the hierarchy resulting in horizontally and vertically
separable models. Third, we validate our claims on real-world data and
demonstrate how HSWLM improves the accuracy of classification and how it
provides transferable models over time. Although discussions in this paper
focus on the classification problem, the models are applicable to any
information access task on data that has, or can be mapped to, a hierarchical
structure.
Comment: Full paper (10 pages) accepted for publication in proceedings of ACM
SIGIR International Conference on the Theory of Information Retrieval
(ICTIR'16)